Netflix Case Study

Netflix is an American subscription video on-demand over-the-top streaming service. The service primarily distributes original and acquired films and television shows from various genres, and it is available internationally in multiple languages.

Launched on January 16, 2007, nearly a decade after Netflix, Inc. began its pioneering DVD‑by‑mail movie rental service, Netflix is the most-subscribed video on demand streaming media service, with 238.39 million paid memberships in more than 190 countries. By 2022, "Netflix Original" productions accounted for half of its library in the United States and the namesake company had ventured into other categories, such as video game publishing of mobile games via its flagship service. As of October 2023, Netflix is the 24th most-visited website in the world with 23.66% of its traffic coming from the United States, followed by the United Kingdom at 5.84% and Brazil at 5.64%.

image.png

At the time of writing, Netflix stock is down compared with its opening price at 10 am.

In [1]:
# These are the required modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
In [2]:
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, AffinityPropagation
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")
import plotly as py
import plotly.graph_objs as go
import os
py.offline.init_notebook_mode(connected = True)
#print(os.listdir("../input"))
import datetime as dt
import missingno as msno
plt.rcParams['figure.dpi'] = 140

Loading Data

In [3]:
# loading data
df = pd.read_csv('https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/000/940/original/netflix.csv')
In [4]:
df.head()
Out[4]:
show_id type title director cast country date_added release_year rating duration listed_in description
0 s1 Movie Dick Johnson Is Dead Kirsten Johnson NaN United States September 25, 2021 2020 PG-13 90 min Documentaries As her father nears the end of his life, filmm...
1 s2 TV Show Blood & Water NaN Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... South Africa September 24, 2021 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t...
2 s3 TV Show Ganglands Julien Leclercq Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi... NaN September 24, 2021 2021 TV-MA 1 Season Crime TV Shows, International TV Shows, TV Act... To protect his family from a powerful drug lor...
3 s4 TV Show Jailbirds New Orleans NaN NaN NaN September 24, 2021 2021 TV-MA 1 Season Docuseries, Reality TV Feuds, flirtations and toilet talk go down amo...
4 s5 TV Show Kota Factory NaN Mayur More, Jitendra Kumar, Ranjan Raj, Alam K... India September 24, 2021 2021 TV-MA 2 Seasons International TV Shows, Romantic TV Shows, TV ... In a city of coaching centers known to train I...
In [5]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8807 entries, 0 to 8806
Data columns (total 12 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   show_id       8807 non-null   object
 1   type          8807 non-null   object
 2   title         8807 non-null   object
 3   director      6173 non-null   object
 4   cast          7982 non-null   object
 5   country       7976 non-null   object
 6   date_added    8797 non-null   object
 7   release_year  8807 non-null   int64 
 8   rating        8803 non-null   object
 9   duration      8804 non-null   object
 10  listed_in     8807 non-null   object
 11  description   8807 non-null   object
dtypes: int64(1), object(11)
memory usage: 825.8+ KB

The only numerical column is release_year.

In [6]:
df.describe()
Out[6]:
release_year
count 8807.000000
mean 2014.180198
std 8.819312
min 1925.000000
25% 2013.000000
50% 2017.000000
75% 2019.000000
max 2021.000000

Data Preparation Process

In [7]:
# first checking null values of every column
df.isnull().sum()
Out[7]:
show_id            0
type               0
title              0
director        2634
cast             825
country          831
date_added        10
release_year       0
rating             4
duration           3
listed_in          0
description        0
dtype: int64

Only the director, cast, country, date_added, rating, and duration columns have null values.

In [8]:
# first we will handle the rating column
df.rating.value_counts()
Out[8]:
TV-MA       3207
TV-14       2160
TV-PG        863
R            799
PG-13        490
TV-Y7        334
TV-Y         307
PG           287
TV-G         220
NR            80
G             41
TV-Y7-FV       6
NC-17          3
UR             3
74 min         1
84 min         1
66 min         1
Name: rating, dtype: int64

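Three entries in the rating column (74 min, 84 min, and 66 min) are durations that ended up in the wrong column. I move each of them to the duration column and set the rating to the most frequent value, TV-MA.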
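The same repair can be expressed without looking up row indices by hand. The sketch below is only an illustration on an invented toy frame (the rows and the variable names `toy`, `misplaced`, and `most_common` are assumptions, not part of the notebook): it detects ratings that look like durations with a regular expression, moves them across, and then fills the rating with the most frequent valid value.

```python
import pandas as pd

# Toy data (invented) reproducing the problem: durations typed into "rating".
toy = pd.DataFrame({
    "rating": ["TV-MA", "74 min", "TV-MA", "84 min", "PG-13"],
    "duration": ["2 Seasons", None, "1 Season", None, "90 min"],
})

# Rows whose rating is really a duration such as "74 min".
misplaced = toy["rating"].str.fullmatch(r"\d+ min", na=False)

# Move the value into the duration column.
toy.loc[misplaced, "duration"] = toy.loc[misplaced, "rating"]

# Replace the bad ratings with the most frequent valid rating.
most_common = toy.loc[~misplaced, "rating"].mode()[0]
toy.loc[misplaced, "rating"] = most_common

print(toy)
```

Computing the mode from the valid rows only keeps the misplaced strings from influencing the replacement value.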

In [9]:
print(df[df.rating == '84 min'].index)
print(df[df.rating == '66 min'].index)
print(df[df.rating == '74 min'].index)
Int64Index([5794], dtype='int64')
Int64Index([5813], dtype='int64')
Int64Index([5541], dtype='int64')
In [10]:
df.loc[5541,'rating'] = 'TV-MA'
df.loc[5541,'duration'] = '74 min'
df.loc[5794,'rating'] = 'TV-MA'
df.loc[5794,'duration'] = '84 min'
df.loc[5813 ,'rating'] = 'TV-MA'
df.loc[5813,'duration'] = '66 min'
In [11]:
df.rating.value_counts()
Out[11]:
TV-MA       3210
TV-14       2160
TV-PG        863
R            799
PG-13        490
TV-Y7        334
TV-Y         307
PG           287
TV-G         220
NR            80
G             41
TV-Y7-FV       6
NC-17          3
UR             3
Name: rating, dtype: int64
In [12]:
# now we will plot a heat map of the null values
sns.heatmap(df.isnull(),cmap='viridis')
Out[12]:
<Axes: >

-> We will handle the country, director, cast, rating, and duration columns.

-> For the country, director, and cast columns, I fill null values with No Data so that the plots show correct counts. Null ratings are filled with TV-MA and null durations with 0.

In [13]:
df['country'] = df['country'].fillna('No Data')
df['director'] = df['director'].fillna('No Data')
df.rating = df.rating.fillna('TV-MA')
df.cast = df.cast.fillna('No Data')
df.duration = df.duration.fillna(0)

-> Handling the date_added column:

There are several techniques for handling nulls in the date_added column; I use forward fill, with the approach below.

You can fill null values with a specific date or use a forward-fill (ffill) or backward-fill (bfill) strategy. For example, filling null values with forward fill:

In [14]:
df['date_added'] = df['date_added'].ffill()
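To make the difference between the fill strategies concrete, here is a small sketch on an invented three-row Series (the values and the placeholder date "January 1, 2021" are assumptions chosen for illustration). Forward fill borrows the value from the preceding row, so it is only meaningful when neighbouring rows were added at around the same time.

```python
import pandas as pd

# Toy date strings (invented) with one missing value in the middle.
dates = pd.Series(["September 25, 2021", None, "September 24, 2021"])

fixed = dates.fillna("January 1, 2021")  # a specific, arbitrary date
forward = dates.ffill()                   # copy the previous non-null value down
backward = dates.bfill()                  # copy the next non-null value up

print(pd.DataFrame({"fixed": fixed, "ffill": forward, "bfill": backward}))
```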
In [15]:
# this count column is used for some plots
df['count'] = 1
In [16]:
df.date_added.isnull().sum()
Out[16]:
0
In [17]:
df.isnull().sum()
Out[17]:
show_id         0
type            0
title           0
director        0
cast            0
country         0
date_added      0
release_year    0
rating          0
duration        0
listed_in       0
description     0
count           0
dtype: int64

Some columns hold several comma-separated values in a single cell. We split these values into lists and save the lists in new columns.

In [18]:
df['split_cast'] =df.cast.str.split(', ')
df['split_listed_in'] = df.listed_in.str.split(', ')
df['split_country']  = df.country.str.split(', ')
df['split_director'] = df.director.str.split(', ')
In [19]:
df.head(2)
Out[19]:
show_id type title director cast country date_added release_year rating duration listed_in description count split_cast split_listed_in split_country split_director
0 s1 Movie Dick Johnson Is Dead Kirsten Johnson No Data United States September 25, 2021 2020 PG-13 90 min Documentaries As her father nears the end of his life, filmm... 1 [No Data] [Documentaries] [United States] [Kirsten Johnson]
1 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... South Africa September 24, 2021 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 [Ama Qamata, Khosi Ngema, Gail Mabalane, Thaba... [International TV Shows, TV Dramas, TV Mysteries] [South Africa] [No Data]
In [20]:
# keep the data at this stage in a separate DataFrame; .copy() makes it independent of df
Netflix_original_data = df.copy()
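Whether a second DataFrame name is an independent copy depends on how it is created: plain assignment only binds another name to the same object, while `.copy()` produces a separate DataFrame. A minimal sketch of the difference, using invented names (`df_demo`, `alias`, `snapshot`):

```python
import pandas as pd

df_demo = pd.DataFrame({"rating": ["TV-MA"]})

alias = df_demo            # same object under a second name
snapshot = df_demo.copy()  # independent copy of the data

# Modify the first DataFrame afterwards.
df_demo["count"] = 1

print(alias.columns.tolist())     # the alias sees the new column
print(snapshot.columns.tolist())  # the copy does not
```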
In [21]:
Netflix_original_data.shape
Out[21]:
(8807, 17)

We now unnest (explode) the lists created in the previous split step, so that each row holds a single actor, genre, director, and country.
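A small sketch on invented data shows what split followed by explode does to the row count (the titles and actor names are placeholders; the column names follow the notebook). Each explode multiplies the rows of a title by the number of items in that list, which is why the unnested frame is much larger than the original.

```python
import pandas as pd

# Toy frame (invented rows) with comma-separated values in two columns.
toy = pd.DataFrame({
    "title": ["A", "B"],
    "cast": ["Actor 1, Actor 2", "Actor 3"],
    "listed_in": ["Dramas, Comedies", "Documentaries"],
})

# Step 1: split the strings into lists.
toy["split_cast"] = toy["cast"].str.split(", ")
toy["split_listed_in"] = toy["listed_in"].str.split(", ")

# Step 2: explode each list column so every row holds one actor and one genre.
unnested = (
    toy.explode("split_cast")
       .explode("split_listed_in")
       .reset_index(drop=True)
)

# Title "A" has 2 actors x 2 genres = 4 rows; title "B" has 1 x 1 = 1 row.
print(unnested[["title", "split_cast", "split_listed_in"]])
```

Because one title now spans several rows, counts on the unnested data should use distinct titles (for example `nunique()`), not plain row counts.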

In [22]:
d1 = df.explode('split_cast')
d2 = d1.explode('split_listed_in')
d3 = d2.explode('split_director')
Netflix_un_nested_data = d3.explode('split_country').reset_index(drop=True)
In [23]:
Netflix_un_nested_data.shape
Out[23]:
(201991, 17)
In [24]:
Netflix_un_nested_data.isnull().sum()
Out[24]:
show_id            0
type               0
title              0
director           0
cast               0
country            0
date_added         0
release_year       0
rating             0
duration           0
listed_in          0
description        0
count              0
split_cast         0
split_listed_in    0
split_country      0
split_director     0
dtype: int64
In [25]:
Netflix_un_nested_data.head()
Out[25]:
show_id type title director cast country date_added release_year rating duration listed_in description count split_cast split_listed_in split_country split_director
0 s1 Movie Dick Johnson Is Dead Kirsten Johnson No Data United States September 25, 2021 2020 PG-13 90 min Documentaries As her father nears the end of his life, filmm... 1 No Data Documentaries United States Kirsten Johnson
1 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... South Africa September 24, 2021 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Ama Qamata International TV Shows South Africa No Data
2 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... South Africa September 24, 2021 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Ama Qamata TV Dramas South Africa No Data
3 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... South Africa September 24, 2021 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Ama Qamata TV Mysteries South Africa No Data
4 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... South Africa September 24, 2021 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Khosi Ngema International TV Shows South Africa No Data
In [26]:
# dropping unnecessary columns and renaming some of the columns
Netflix_un_nested_data.drop('country',axis =1,inplace = True)
Netflix_un_nested_data.rename({'split_cast' :'Actor', 'split_listed_in':'Genre', 'split_country':'country','cast':'Actors','listed_in':'Genres'},axis=1,inplace=True)
In [27]:
Netflix_un_nested_data.head(3)
Out[27]:
show_id type title director Actors date_added release_year rating duration Genres description count Actor Genre country split_director
0 s1 Movie Dick Johnson Is Dead Kirsten Johnson No Data September 25, 2021 2020 PG-13 90 min Documentaries As her father nears the end of his life, filmm... 1 No Data Documentaries United States Kirsten Johnson
1 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... September 24, 2021 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Ama Qamata International TV Shows South Africa No Data
2 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... September 24, 2021 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Ama Qamata TV Dramas South Africa No Data

Null values

In [28]:
Netflix_un_nested_data.isnull().sum()
Out[28]:
show_id           0
type              0
title             0
director          0
Actors            0
date_added        0
release_year      0
rating            0
duration          0
Genres            0
description       0
count             0
Actor             0
Genre             0
country           0
split_director    0
dtype: int64

How many types of shows does Netflix provide?

In [29]:
# checking null values in un nested data
sns.heatmap(Netflix_un_nested_data.isnull(),cmap='viridis')
Out[29]:
<Axes: >
In [30]:
Netflix_un_nested_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 201991 entries, 0 to 201990
Data columns (total 16 columns):
 #   Column          Non-Null Count   Dtype 
---  ------          --------------   ----- 
 0   show_id         201991 non-null  object
 1   type            201991 non-null  object
 2   title           201991 non-null  object
 3   director        201991 non-null  object
 4   Actors          201991 non-null  object
 5   date_added      201991 non-null  object
 6   release_year    201991 non-null  int64 
 7   rating          201991 non-null  object
 8   duration        201991 non-null  object
 9   Genres          201991 non-null  object
 10  description     201991 non-null  object
 11  count           201991 non-null  int64 
 12  Actor           201991 non-null  object
 13  Genre           201991 non-null  object
 14  country         201991 non-null  object
 15  split_director  201991 non-null  object
dtypes: int64(2), object(14)
memory usage: 24.7+ MB

From date_added I extract the month, month name, year, and week added; these are used in some of the later plots.
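As a quick check of what each date part looks like, here is a sketch on two date strings taken from the first rows of the dataset. It uses `dt.isocalendar().week` for the week number, since the older `Series.dt.week` accessor was deprecated and is no longer available in pandas 2.x; the `parts` frame is just an illustrative name.

```python
import pandas as pd

# Two date strings in the same format as the date_added column.
dates = pd.to_datetime(pd.Series(["September 25, 2021", "September 24, 2021"]))

parts = pd.DataFrame({
    "month_added": dates.dt.month,
    "month_name_added": dates.dt.month_name(),
    "year_added": dates.dt.year,
    "week_added": dates.dt.isocalendar().week,  # ISO week number
})

print(parts)
```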

In [31]:
Netflix_un_nested_data["date_added"] = pd.to_datetime(Netflix_un_nested_data['date_added'])

Netflix_un_nested_data['month_added']=Netflix_un_nested_data['date_added'].dt.month
Netflix_un_nested_data['month_name_added']=Netflix_un_nested_data['date_added'].dt.month_name()
Netflix_un_nested_data['year_added'] = Netflix_un_nested_data['date_added'].dt.year
Netflix_un_nested_data['week_added'] = Netflix_un_nested_data['date_added'].dt.isocalendar().week  # .dt.week was removed in pandas 2.0
In [32]:
Netflix_original_data["date_added"] = pd.to_datetime(Netflix_original_data['date_added'])

Netflix_original_data['month_added']=Netflix_original_data['date_added'].dt.month
Netflix_original_data['month_name_added']=Netflix_original_data['date_added'].dt.month_name()
Netflix_original_data['year_added'] = Netflix_original_data['date_added'].dt.year
Netflix_original_data['week_added'] = Netflix_original_data['date_added'].dt.isocalendar().week  # .dt.week was removed in pandas 2.0
In [33]:
Netflix_un_nested_data.head(2)
Out[33]:
show_id type title director Actors date_added release_year rating duration Genres description count Actor Genre country split_director month_added month_name_added year_added week_added
0 s1 Movie Dick Johnson Is Dead Kirsten Johnson No Data 2021-09-25 2020 PG-13 90 min Documentaries As her father nears the end of his life, filmm... 1 No Data Documentaries United States Kirsten Johnson 9 September 2021 38
1 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... 2021-09-24 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Ama Qamata International TV Shows South Africa No Data 9 September 2021 38
In [34]:
Netflix_un_nested_data.country.nunique() # total 128 countries in the un nested data
Out[34]:
128

image.png

Using a consistent color palette is a great way to give your work credibility. It looks professional, and keeps the reader engaged.

It's an easy-to-implement tip that really helps.

Ratio of Movies & TV shows

Question 1 starts here.

In [35]:
x=Netflix_original_data.groupby(['type'])['type'].count()
y=len(Netflix_original_data)
print(x)
print(y)
type
Movie      6131
TV Show    2676
Name: type, dtype: int64
8807
In [36]:
r=((x/y)).round(3)
# print(r)
mf_ratio = pd.DataFrame(r).T
print(mf_ratio)
type  Movie  TV Show
type  0.696    0.304
In [37]:
fig,ax = plt.subplots(1,1,figsize=(6.5, 2.5))

ax.barh(mf_ratio.index, mf_ratio['Movie'],
        color='#b20710', alpha=0.9, label='Movie')
ax.barh(mf_ratio.index, mf_ratio['TV Show'], left=mf_ratio['Movie'],
        color='#221f1f', alpha=0.9, label='TV Show')

ax.set_xlim(0, 1)
ax.set_xticks([])
ax.set_yticks([])
#ax.set_yticklabels(mf_ratio.index, fontfamily='serif', fontsize=11)


# movie percentage
for i in mf_ratio.index:
    ax.annotate(f"{int(mf_ratio['Movie'][i]*100)}%",
                   xy=(mf_ratio['Movie'][i]/2, i),
                   va = 'center', ha='center',fontsize=40, fontweight='light', fontfamily='serif',
                   color='white')

    ax.annotate("Movie",
                   xy=(mf_ratio['Movie'][i]/2, -0.25),
                   va = 'center', ha='center',fontsize=15, fontweight='light', fontfamily='serif',
                   color='white')


for i in mf_ratio.index:
    ax.annotate(f"{int(mf_ratio['TV Show'][i]*100)}%",
                   xy=(mf_ratio['Movie'][i]+mf_ratio['TV Show'][i]/2, i),
                   va = 'center', ha='center',fontsize=40, fontweight='light', fontfamily='serif',
                   color='white')
    ax.annotate("TV Show",
                   xy=(mf_ratio['Movie'][i]+mf_ratio['TV Show'][i]/2, -0.25),
                   va = 'center', ha='center',fontsize=15, fontweight='light', fontfamily='serif',
                   color='white')
# Title & Subtitle
fig.text(0.125,1.03,'Movie & TV Show distribution', fontfamily='serif',fontsize=15, fontweight='bold')
fig.text(0.125,0.92,'We see vastly more movies than TV shows on Netflix.',fontfamily='serif',fontsize=12)

for s in ['top', 'left', 'right', 'bottom']:
    ax.spines[s].set_visible(False)

# Removing legend due to labelled plot
ax.legend().set_visible(False)
plt.show()

Netflix has far more movies than TV shows: about 70% of the titles are movies and about 30% are TV shows. This suggests that Netflix targets movie audiences more than TV show audiences.
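The share of each type can also be computed in one step with `value_counts(normalize=True)`. The sketch below uses an invented ten-row Series with a 7:3 split purely for illustration; on the real data the same call would be applied to the type column.

```python
import pandas as pd

# Toy "type" column (invented): 7 movies and 3 TV shows.
types = pd.Series(["Movie"] * 7 + ["TV Show"] * 3, name="type")

# normalize=True returns proportions instead of raw counts.
share = types.value_counts(normalize=True).round(3)

print(share)
```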

Comparison by rating

In [38]:
Netflix_original_data.rating.value_counts()
Out[38]:
TV-MA       3214
TV-14       2160
TV-PG        863
R            799
PG-13        490
TV-Y7        334
TV-Y         307
PG           287
TV-G         220
NR            80
G             41
TV-Y7-FV       6
NC-17          3
UR             3
Name: rating, dtype: int64

Frequency of ratings on Netflix

TV-PG: Older Kids,

TV-MA: Adults,

TV-Y7-FV: Older Kids,

TV-Y7: Older Kids,

TV-14: Teens,

R: Adults,

TV-Y: Kids,

NR: Adults,

PG-13 : Teens,

TV-G: Kids,

PG: Older Kids,

G: Kids,

UR: Adults,

NC-17: Adults
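The rating-to-audience list above can be turned into a lookup table and applied with `Series.map`. This is only a sketch: the dictionary copies the mapping listed above, while the sample ratings and the names `ratings_ages`, `sample`, and `target_ages` are assumptions for illustration.

```python
import pandas as pd

# Mapping from rating to target audience, as listed above.
ratings_ages = {
    "TV-PG": "Older Kids", "TV-MA": "Adults", "TV-Y7-FV": "Older Kids",
    "TV-Y7": "Older Kids", "TV-14": "Teens", "R": "Adults", "TV-Y": "Kids",
    "NR": "Adults", "PG-13": "Teens", "TV-G": "Kids", "PG": "Older Kids",
    "G": "Kids", "UR": "Adults", "NC-17": "Adults",
}

# Toy ratings (invented sample) standing in for the rating column.
sample = pd.Series(["TV-MA", "PG-13", "TV-Y", "R"], name="rating")

# Translate each rating into its audience group, then count the groups.
target_ages = sample.map(ratings_ages)
counts = target_ages.value_counts()

print(counts)
```

Grouping by audience instead of by raw rating gives four categories to compare instead of fourteen.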

In [ ]:
# target audiences
sns.countplot(x='rating',data = Netflix_original_data)
plt.title('Distribution of ratings')
plt.xticks(rotation=45)
plt.show()

The most common rating is TV-MA, so adults are the largest target audience. Whenever you launch a show or movie, keep in mind that much of the audience is adult, in particular the 18 to 25 age group, so focus more on content for that group, such as trending movies and more popular actors.

In [ ]:
Netflix_un_nested_data.columns
Out[ ]:
Index(['show_id', 'type', 'title', 'director', 'Actors', 'date_added',
       'release_year', 'rating', 'duration', 'Genres', 'description', 'Actor',
       'Genre', 'country', 'split_director', 'count', 'month_added',
       'month_name_added', 'year_added', 'week_added'],
      dtype='object')
In [ ]:
Netflix_original_data.columns
Out[ ]:
Index(['show_id', 'type', 'title', 'director', 'cast', 'country', 'date_added',
       'release_year', 'rating', 'duration', 'listed_in', 'description',
       'split_cast', 'split_listed_in', 'split_country', 'split_director',
       'count', 'month_added', 'month_name_added', 'year_added', 'week_added'],
      dtype='object')
In [ ]:
Netflix_original_data.director.value_counts()
Out[ ]:
No Data                           2634
Rajiv Chilaka                       19
Raúl Campos, Jan Suter              18
Suhas Kadav                         16
Marcus Raboy                        16
                                  ... 
Raymie Muzquiz, Stu Livingston       1
Joe Menendez                         1
Eric Bross                           1
Will Eisenberg                       1
Mozez Singh                          1
Name: director, Length: 4529, dtype: int64

The content added over the years

In [ ]:
# Split the data by content type
d1 = Netflix_original_data[Netflix_original_data["type"] == "TV Show"]
d2 = Netflix_original_data[Netflix_original_data["type"] == "Movie"]

col = "year_added"

# Calculate counts and percentages for TV Shows
# (rename_axis + reset_index(name=...) gives the same column names across pandas versions)
vc1 = d1[col].value_counts().rename_axis(col).reset_index(name="count")
vc1['percent'] = vc1['count'] * 100 / vc1['count'].sum()
vc1 = vc1.sort_values(col)

# Calculate counts and percentages for Movies
vc2 = d2[col].value_counts().rename_axis(col).reset_index(name="count")
vc2['percent'] = vc2['count'] * 100 / vc2['count'].sum()
vc2 = vc2.sort_values(col)

# Plot using Seaborn
# plt.figure(figsize=(12, 6))
sns.lineplot(x=col, y="count", data=vc1, label="TV Shows", marker='o', color="#a678de")
sns.lineplot(x=col, y="count", data=vc2, label="Movies", marker='o', color="#6ad49b")

plt.title("Content added over the years")
plt.xlabel(col)
plt.ylabel("Count")
plt.legend(loc="upper left")
plt.show()

The number of movies on Netflix has grown much faster than the number of TV shows. About 1300 new movies were added in each of 2018 and 2019. The growth in content started around 2013. Netflix kept adding movies and TV shows to its platform over the years, and this content is varied: it comes from different countries and was released in different years.

Next, we explore the countries by the amount of content they produced for Netflix. A title can list several countries, so we need to separate them before analyzing; titles with no country available (No Data) can then be filtered out.
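The separate-then-count step can also be written with `explode`. The following is a minimal sketch on an invented three-row frame (the titles are placeholders; the column name and the No Data label follow the notebook), not the exact code used for the plots.

```python
import pandas as pd

# Toy frame (invented rows): one co-production, one single country, one missing.
toy = pd.DataFrame({
    "title": ["A", "B", "C"],
    "country": ["United States, India", "India", "No Data"],
})

# One row per (title, country) pair.
countries = toy["country"].str.split(", ").explode()

# Drop titles with no country available.
countries = countries[countries != "No Data"]

counts = countries.value_counts()
print(counts)
```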

1. Countries by the amount of content produced

In [39]:
filtered_countries = Netflix_original_data.set_index('title').country.str.split(', ', expand=True).stack().reset_index(level=1, drop=True)
filtered_countries.value_counts()
Out[39]:
United States     3689
India             1046
No Data            831
United Kingdom     804
Canada             445
                  ... 
Bermuda              1
Ecuador              1
Armenia              1
Mongolia             1
Montenegro           1
Length: 128, dtype: int64
In [42]:
filtered_countries = Netflix_original_data.set_index('title').country.str.split(', ', expand=True).stack().reset_index(level=1, drop=True)
# optional: uncomment to drop titles with no country available
# filtered_countries = filtered_countries[filtered_countries != 'No Data']
plt.figure(figsize=(13,7))
ax = sns.countplot(y = filtered_countries, order=filtered_countries.value_counts().index[:20])
ax.bar_label(ax.containers[0], fontsize=10)
plt.text(2000,4,'Observation:\nThe United States contributes the most titles, as Netflix\nwas launched there. Surprisingly, India is in second place;\nIndia has a larger young population than many other countries.')
plt.xlabel('Titles')
plt.ylabel('Country')
plt.title('Countries with most content')
plt.show()

Recommendation:

The United States is in first place compared with other countries, because Netflix started in the United States. Focus more on American series, TV shows, and movies, such as action titles, and add more content of this kind.

2. Top directors on Netflix

In [44]:
Netflix_original_data.director.value_counts() # this count is misleading: co-directors such as 'Raúl Campos, Jan Suter' (third row) are counted as one combined entry, so we need to split them before counting
Out[44]:
No Data                           2634
Rajiv Chilaka                       19
Raúl Campos, Jan Suter              18
Suhas Kadav                         16
Marcus Raboy                        16
                                  ... 
Raymie Muzquiz, Stu Livingston       1
Joe Menendez                         1
Eric Bross                           1
Will Eisenberg                       1
Mozez Singh                          1
Name: director, Length: 4529, dtype: int64
In [45]:
# this is the correct count: missing directors were filled with 'No Data' (not an 'unknown director' label), and those rows are removed before counting
filtered_directors = Netflix_original_data[Netflix_original_data.director != 'No Data'].set_index('show_id').director.str.split(', ', expand=True).stack().reset_index(level=1, drop=True)
filtered_directors.value_counts()
Out[45]:
Rajiv Chilaka     22
Jan Suter         21
Raúl Campos       19
Suhas Kadav       16
Marcus Raboy      16
                  ..
Raymie Muzquiz     1
Stu Livingston     1
Joe Menendez       1
Eric Bross         1
Mozez Singh        1
Length: 4993, dtype: int64
In [46]:
filtered_directors = Netflix_original_data[Netflix_original_data.director != 'No Data'].set_index('show_id').director.str.split(', ', expand=True).stack().reset_index(level=1, drop=True)
plt.figure(figsize=(13,7))
plt.title('Top directors by number of titles')
ax = sns.countplot(y = filtered_directors, order=filtered_directors.value_counts().index[:20], palette='rocket')
ax.bar_label(ax.containers[0], fontsize=10)
plt.show()

Across TV shows and movies, the director with the most titles on Netflix is Rajiv Chilaka, with 22.

Rajiv Chilaka, also known as Rajiv Chilakalapudi and Sitarama Rajiv Chilakalapudi, is the founder and CEO of Hyderabad-based Green Gold Animations and the creator of several cartoons, including the Krishna cartoon series and Chhota Bheem, which has been made into an animated series and films.

3. Frequency of movies released over the years

In [ ]:
plt.figure(figsize=(20,8))
sns.countplot(data=Netflix_original_data, x='release_year')
plt.xticks(rotation=90)
plt.title('Frequency of titles on Netflix by release year')
plt.show()

Observation:

The number of titles released per year rises from 2012 to 2018. From 2019 onwards it decreases, probably because far fewer movies were released during the COVID-19 pandemic.

4. Frequency of movies added to Netflix over the years

In [ ]:
plt.figure(figsize=(12,5))
ax = sns.countplot(data=Netflix_original_data,x = 'year_added')
ax.bar_label(ax.containers[0], fontsize=10)
plt.show()

Observation:

A large number of titles were added to Netflix in 2019. More people were also using the service during the COVID-19 period that followed, which may have encouraged further additions.

5. Top genres on Netflix based on titles

In [47]:
filtered_genres = Netflix_original_data.set_index('title').listed_in.str.split(', ', expand=True).stack().reset_index(level=1, drop=True)
filtered_genres.value_counts()
Out[47]:
International Movies            2752
Dramas                          2427
Comedies                        1674
International TV Shows          1351
Documentaries                    869
Action & Adventure               859
TV Dramas                        763
Independent Movies               756
Children & Family Movies         641
Romantic Movies                  616
TV Comedies                      581
Thrillers                        577
Crime TV Shows                   470
Kids' TV                         451
Docuseries                       395
Music & Musicals                 375
Romantic TV Shows                370
Horror Movies                    357
Stand-Up Comedy                  343
Reality TV                       255
British TV Shows                 253
Sci-Fi & Fantasy                 243
Sports Movies                    219
Anime Series                     176
Spanish-Language TV Shows        174
TV Action & Adventure            168
Korean TV Shows                  151
Classic Movies                   116
LGBTQ Movies                     102
TV Mysteries                      98
Science & Nature TV               92
TV Sci-Fi & Fantasy               84
TV Horror                         75
Anime Features                    71
Cult Movies                       71
Teen TV Shows                     69
Faith & Spirituality              65
TV Thrillers                      57
Movies                            57
Stand-Up Comedy & Talk Shows      56
Classic & Cult TV                 28
TV Shows                          16
dtype: int64
In [49]:
filtered_genres = Netflix_original_data.set_index('title').listed_in.str.split(', ', expand=True).stack().reset_index(level=1, drop=True)
plt.figure(figsize=(10,10))
ax = sns.countplot(y = filtered_genres, order=filtered_genres.value_counts().index[:25])
ax.bar_label(ax.containers[0], fontsize=10)
plt.title('Top Genres on Netflix')
plt.text(1400,7,'As per the viz, the top genre is International Movies,\nfollowed by Dramas and Comedies')
plt.xlabel('Titles')
plt.ylabel('Genres')
plt.show()

Recommendation:

People all over the world are interested in international movies. Focus more on international movies and dramas, which have dedicated fan bases, for example Korean dramas. Anime Series is only in 24th position.

Anime also has a large fan base: sites like aniwatch draw big audiences for anime. Try to add more famous anime series, such as Demon Slayer, which is very popular nowadays, and Naruto; if Naruto were added in all languages, it could be a bigger hit than many movies and TV shows.

Focus more on titles like these.

6. Top actors on Netflix based on the number of titles

In [ ]:
filtered_cast_shows = Netflix_original_data[Netflix_original_data.cast != 'No Data'].set_index('title').cast.str.split(', ', expand=True).stack().reset_index(level=1, drop=True)
filtered_cast_shows.value_counts()
Out[ ]:
Anupam Kher                43
Shah Rukh Khan             35
Julie Tejwani              33
Naseeruddin Shah           32
Takahiro Sakurai           32
                           ..
Maryam Zaree                1
Melanie Straub              1
Gabriela Maria Schmeide     1
Helena Zengel               1
Chittaranjan Tripathy       1
Length: 36439, dtype: int64
In [ ]:
filtered_cast_shows = Netflix_original_data[Netflix_original_data.cast != 'No Data'].set_index('title').cast.str.split(', ', expand=True).stack().reset_index(level=1, drop=True)
plt.figure(figsize=(13,7))
plt.title('Top actors on Netflix')
plt.xlabel('Number of titles')
plt.ylabel('Actor Name')
ax = sns.countplot(y = filtered_cast_shows, order=filtered_cast_shows.value_counts().index[:25], palette='pastel')
ax.bar_label(ax.containers[0], fontsize=10)
plt.show()

Observation:

It is an honor for India that many of the top 20 actors on Netflix are Indian.

These actors are more famous than others, and they are very talented.

I did not expect Anupam Kher to be the top actor in the whole world.

7. Movies added to Netflix month over month

In [ ]:
ax = sns.countplot(data=Netflix_original_data, x='month_added')
ax.bar_label(ax.containers[0], fontsize=10)
plt.title('Movies added to Netflix over the months')
plt.xlabel('Months')
plt.ylabel('count of data')
plt.show()
In [50]:
sns.kdeplot(data=Netflix_original_data,x='month_added')
Out[50]:
<Axes: xlabel='month_added', ylabel='Density'>

There is a clear peak in months 5 to 7, so try to add more movies in these months to reach a larger audience.

Question 2: Comparison of TV shows vs. movies

In [52]:
Netflix_movies = Netflix_un_nested_data[Netflix_un_nested_data.type == 'Movie']
Netflix_Tv_shows = Netflix_un_nested_data[Netflix_un_nested_data.type == 'TV Show']
In [53]:
print(Netflix_movies.shape)
print(Netflix_Tv_shows.shape)
print(f"Total rows are {Netflix_movies.shape[0]+Netflix_Tv_shows.shape[0]}")
(145843, 20)
(56148, 20)
Total rows are 201991
In [54]:
Netflix_movies.head()
Out[54]:
show_id type title director Actors date_added release_year rating duration Genres description count Actor Genre country split_director month_added month_name_added year_added week_added
0 s1 Movie Dick Johnson Is Dead Kirsten Johnson No Data 2021-09-25 2020 PG-13 90 min Documentaries As her father nears the end of his life, filmm... 1 No Data Documentaries United States Kirsten Johnson 9 September 2021 38
159 s7 Movie My Little Pony: A New Generation Robert Cullen, José Luis Ucha Vanessa Hudgens, Kimiko Glenn, James Marsden, ... 2021-09-24 2021 PG 91 min Children & Family Movies Equestria's divided. But a bright-eyed hero be... 1 Vanessa Hudgens Children & Family Movies No Data Robert Cullen 9 September 2021 38
160 s7 Movie My Little Pony: A New Generation Robert Cullen, José Luis Ucha Vanessa Hudgens, Kimiko Glenn, James Marsden, ... 2021-09-24 2021 PG 91 min Children & Family Movies Equestria's divided. But a bright-eyed hero be... 1 Vanessa Hudgens Children & Family Movies No Data José Luis Ucha 9 September 2021 38
161 s7 Movie My Little Pony: A New Generation Robert Cullen, José Luis Ucha Vanessa Hudgens, Kimiko Glenn, James Marsden, ... 2021-09-24 2021 PG 91 min Children & Family Movies Equestria's divided. But a bright-eyed hero be... 1 Kimiko Glenn Children & Family Movies No Data Robert Cullen 9 September 2021 38
162 s7 Movie My Little Pony: A New Generation Robert Cullen, José Luis Ucha Vanessa Hudgens, Kimiko Glenn, James Marsden, ... 2021-09-24 2021 PG 91 min Children & Family Movies Equestria's divided. But a bright-eyed hero be... 1 Kimiko Glenn Children & Family Movies No Data José Luis Ucha 9 September 2021 38
In [58]:
# top 10 countries
filtered_countries.value_counts().index[:10]
Out[58]:
Index(['United States', 'India', 'No Data', 'United Kingdom', 'Canada',
       'France', 'Japan', 'Spain', 'South Korea', 'Germany'],
      dtype='object')
In [66]:
result = Netflix_movies.groupby('country')['title'].nunique().reset_index(name='unique_titles_count')
result = result[result.country != 'No Data']
In [67]:
result = result.sort_values(by='unique_titles_count', ascending=False).head(10)
plt.figure(figsize=(12,6))
ax = sns.barplot(data=result, x='country', y= 'unique_titles_count',palette='magma')
ax.bar_label(ax.containers[0], fontsize=10)
plt.title('Top 10 countries by number of movie titles')
plt.show()

Observation¶

These are the top 10 countries by number of movie titles on Netflix, led by the United States, India and the United Kingdom. Surprisingly, India is in second position, ahead of every other country outside the United States.

During the last three years, India has witnessed the fastest growing content investment from Netflix in any country globally, as the streaming giant plans to focus on developing more original Indian content after the roaring success of Originals such as Sacred Games and Lust Stories. The global OTT service provider recently announced thirteen original movies and nine series ranging from genres such as young adult and horror to drama and fantasy.
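The country ranking above counts unique titles rather than rows, which matters because the un-nested data repeats each title once per country, actor and director. The sketch below illustrates the difference on invented data; the frame `raw` and its values are hypothetical and only the column names follow the dataset.

```python
import pandas as pd

# Hypothetical toy data: two titles, one with two countries and two actors.
raw = pd.DataFrame({
    "title": ["Film X", "Film Y"],
    "country": ["United States, India", "India"],
    "cast": ["Actor 1, Actor 2", "Actor 3"],
})

# Un-nest: split the comma-separated strings and explode one column at a time.
flat = raw.copy()
for col in ["country", "cast"]:
    flat[col] = flat[col].str.split(", ")
    flat = flat.explode(col, ignore_index=True)

# Row counts are inflated by the cross product of countries and actors...
rows_per_country = flat["country"].value_counts()
# ...whereas nunique() counts each title once per country.
titles_per_country = flat.groupby("country")["title"].nunique()

print(rows_per_country.to_dict())
print(titles_per_country.to_dict())
```

On this toy data, row counts and title counts differ for both countries, which is why `nunique()` is the safer measure for ranking.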

In [68]:
# top 10 directors in movies (the 'No Data' placeholder is excluded explicitly)
result = Netflix_movies.groupby('director')['title'].nunique().reset_index(name='unique_titles_count')
result = result[result.director != 'No Data'].sort_values(by='unique_titles_count', ascending=False).head(10)
In [69]:
plt.figure(figsize=(14,6))
ax = sns.barplot(data=result, y='director', x= 'unique_titles_count',palette='magma')
ax.bar_label(ax.containers[0], fontsize=10)
plt.title('Top 10 directors on movies')
plt.show()
In [70]:
Netflix_original_data.columns
Out[70]:
Index(['show_id', 'type', 'title', 'director', 'cast', 'country', 'date_added',
       'release_year', 'rating', 'duration', 'listed_in', 'description',
       'count', 'split_cast', 'split_listed_in', 'split_country',
       'split_director', 'month_added', 'month_name_added', 'year_added',
       'week_added'],
      dtype='object')
In [71]:
Netflix_movies[Netflix_movies.director == 'Rajiv Chilaka']
Out[71]:
show_id type title director Actors date_added release_year rating duration Genres description count Actor Genre country split_director month_added month_name_added year_added week_added
10058 s407 Movie Chhota Bheem - Neeli Pahaadi Rajiv Chilaka Vatsal Dubey, Julie Tejwani, Rupa Bhimani, Jig... 2021-07-22 2013 TV-Y7 64 min Children & Family Movies Things get spooky when Bheem and his buddies t... 1 Vatsal Dubey Children & Family Movies No Data Rajiv Chilaka 7 July 2021 29
10059 s407 Movie Chhota Bheem - Neeli Pahaadi Rajiv Chilaka Vatsal Dubey, Julie Tejwani, Rupa Bhimani, Jig... 2021-07-22 2013 TV-Y7 64 min Children & Family Movies Things get spooky when Bheem and his buddies t... 1 Julie Tejwani Children & Family Movies No Data Rajiv Chilaka 7 July 2021 29
10060 s407 Movie Chhota Bheem - Neeli Pahaadi Rajiv Chilaka Vatsal Dubey, Julie Tejwani, Rupa Bhimani, Jig... 2021-07-22 2013 TV-Y7 64 min Children & Family Movies Things get spooky when Bheem and his buddies t... 1 Rupa Bhimani Children & Family Movies No Data Rajiv Chilaka 7 July 2021 29
10061 s407 Movie Chhota Bheem - Neeli Pahaadi Rajiv Chilaka Vatsal Dubey, Julie Tejwani, Rupa Bhimani, Jig... 2021-07-22 2013 TV-Y7 64 min Children & Family Movies Things get spooky when Bheem and his buddies t... 1 Jigna Bhardwaj Children & Family Movies No Data Rajiv Chilaka 7 July 2021 29
10062 s407 Movie Chhota Bheem - Neeli Pahaadi Rajiv Chilaka Vatsal Dubey, Julie Tejwani, Rupa Bhimani, Jig... 2021-07-22 2013 TV-Y7 64 min Children & Family Movies Things get spooky when Bheem and his buddies t... 1 Rajesh Kava Children & Family Movies No Data Rajiv Chilaka 7 July 2021 29
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
64929 s2718 Movie Chhota Bheem and the Curse of Damyaan Rajiv Chilaka Kaustav Ghosh, Jigna Bhardwaj, Chutki, Rajesh ... 2020-04-01 2012 TV-Y7 87 min Children & Family Movies An evil demon who traded his freedom for immor... 1 Arun Shekar Children & Family Movies India Rajiv Chilaka 4 April 2020 14
64930 s2718 Movie Chhota Bheem and the Curse of Damyaan Rajiv Chilaka Kaustav Ghosh, Jigna Bhardwaj, Chutki, Rajesh ... 2020-04-01 2012 TV-Y7 87 min Children & Family Movies An evil demon who traded his freedom for immor... 1 Julie Tejwani Children & Family Movies India Rajiv Chilaka 4 April 2020 14
64931 s2718 Movie Chhota Bheem and the Curse of Damyaan Rajiv Chilaka Kaustav Ghosh, Jigna Bhardwaj, Chutki, Rajesh ... 2020-04-01 2012 TV-Y7 87 min Children & Family Movies An evil demon who traded his freedom for immor... 1 Anamaya Verma Children & Family Movies India Rajiv Chilaka 4 April 2020 14
142214 s6298 Movie Bheemayan Rajiv Chilaka No Data 2019-05-10 2018 TV-Y 63 min Children & Family Movies It's Diwali! To celebrate, Chhota Bheem and hi... 1 No Data Children & Family Movies No Data Rajiv Chilaka 5 May 2019 19
150769 s6646 Movie Dragonkala Ka Rahasya Rajiv Chilaka No Data 2019-06-18 2018 TV-Y 68 min Children & Family Movies Bheem helps to reopen Dragonpur's abandoned ma... 1 No Data Children & Family Movies No Data Rajiv Chilaka 6 June 2019 25

128 rows × 20 columns

In [72]:
Netflix_movies.rename({'split_director':'director_name'},axis=1,inplace=True)
In [73]:
movies = Netflix_movies.groupby(['country','director_name']).title.nunique().reset_index(name='unique_title_count')
In [74]:
movies = Netflix_movies.groupby(['country','director_name'])['title'].nunique().reset_index(name='unique_titles_count').sort_values(by='unique_titles_count',ascending=False)
In [75]:
Netflix_Tv_shows.head(3)
Out[75]:
show_id type title director Actors date_added release_year rating duration Genres description count Actor Genre country split_director month_added month_name_added year_added week_added
1 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... 2021-09-24 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Ama Qamata International TV Shows South Africa No Data 9 September 2021 38
2 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... 2021-09-24 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Ama Qamata TV Dramas South Africa No Data 9 September 2021 38
3 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... 2021-09-24 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Ama Qamata TV Mysteries South Africa No Data 9 September 2021 38
In [76]:
Netflix_Tv_shows.rename({'split_director':'director_name'}, axis=1, inplace=True)
In [78]:
res = Netflix_Tv_shows.groupby('country')['title'].nunique().reset_index(name='unique_title_count')

res = res[res.country != 'No Data']

res = res.sort_values(by='unique_title_count', ascending = False)

plt.figure(figsize=(14,6))
ax = sns.barplot(data=res[:10], y='country', x= 'unique_title_count',palette='magma')
ax.bar_label(ax.containers[0], fontsize=10)
plt.title('Top 10 countries by number of TV shows')
plt.show()

Observation¶

For TV shows, India is not in the top 10, because comparatively few Indian TV shows have been added to Netflix.

India has a large audience for TV shows and serials, and other platforms such as Hotstar offer more TV shows than Netflix. Adding more Indian serials and TV shows is recommended.
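One way to put the movie and TV show counts side by side is a country-by-type table of unique titles. The following is only a sketch on invented data; the frame `toy`, its values and the derived column `tv_share` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical un-nested rows (one row per title/country pair).
toy = pd.DataFrame({
    "title":   ["M1", "M2", "M3", "S1", "S2", "S3", "M4"],
    "type":    ["Movie", "Movie", "Movie", "TV Show", "TV Show", "TV Show", "Movie"],
    "country": ["India", "India", "Japan", "Japan", "Japan", "India", "India"],
})

# Unique titles per country and type, laid out as a country x type table.
counts = (
    toy.groupby(["country", "type"])["title"]
       .nunique()
       .unstack(fill_value=0)
)

# Share of TV shows within each country's catalogue.
counts["tv_share"] = counts["TV Show"] / (counts["Movie"] + counts["TV Show"])
print(counts)
```

A low `tv_share` for a country with many movies would support the observation that its TV catalogue is under-represented.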

3 What is the best time to launch a TV show?¶

a. Find which is the best week to release the TV show or the movie. Do the analysis separately for TV shows and movies.

Hint: We expect you to create a new column and group by each week and count the total number of movies/TV shows.

b. Find which is the best month to release the TV show or the movie. Do the analysis separately for TV shows and movies.

Hint: We expect you to create a new column and group by each month and count the total number of movies/TV shows.
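The hints can be sketched as follows: derive week and month columns from the parsed date, then group and count unique titles for each type. The frame `toy` and its dates are invented; the column names `week_added` and `month_added` match the ones used in this notebook.

```python
import pandas as pd

# Hypothetical toy data with already-parsed dates.
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "type": ["Movie", "Movie", "TV Show", "Movie"],
    "date_added": pd.to_datetime(["2021-01-04", "2021-01-06", "2021-01-05", "2021-07-01"]),
})

# New columns: ISO week number (1-53) and calendar month (1-12).
toy["week_added"] = toy["date_added"].dt.isocalendar().week.astype(int)
toy["month_added"] = toy["date_added"].dt.month

# Count unique titles per week, separately for each type.
per_week = (
    toy.groupby(["type", "week_added"])["title"]
       .nunique()
       .reset_index(name="title_count")
)

# Busiest week for each type.
best_week = per_week.loc[per_week.groupby("type")["title_count"].idxmax()]
print(best_week)
```

Replacing `week_added` with `month_added` in the `groupby` gives the per-month version of the same analysis.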

In [79]:
# find the best week to release a movie
Netflix_movies.head()
Out[79]:
show_id type title director Actors date_added release_year rating duration Genres description count Actor Genre country director_name month_added month_name_added year_added week_added
0 s1 Movie Dick Johnson Is Dead Kirsten Johnson No Data 2021-09-25 2020 PG-13 90 min Documentaries As her father nears the end of his life, filmm... 1 No Data Documentaries United States Kirsten Johnson 9 September 2021 38
159 s7 Movie My Little Pony: A New Generation Robert Cullen, José Luis Ucha Vanessa Hudgens, Kimiko Glenn, James Marsden, ... 2021-09-24 2021 PG 91 min Children & Family Movies Equestria's divided. But a bright-eyed hero be... 1 Vanessa Hudgens Children & Family Movies No Data Robert Cullen 9 September 2021 38
160 s7 Movie My Little Pony: A New Generation Robert Cullen, José Luis Ucha Vanessa Hudgens, Kimiko Glenn, James Marsden, ... 2021-09-24 2021 PG 91 min Children & Family Movies Equestria's divided. But a bright-eyed hero be... 1 Vanessa Hudgens Children & Family Movies No Data José Luis Ucha 9 September 2021 38
161 s7 Movie My Little Pony: A New Generation Robert Cullen, José Luis Ucha Vanessa Hudgens, Kimiko Glenn, James Marsden, ... 2021-09-24 2021 PG 91 min Children & Family Movies Equestria's divided. But a bright-eyed hero be... 1 Kimiko Glenn Children & Family Movies No Data Robert Cullen 9 September 2021 38
162 s7 Movie My Little Pony: A New Generation Robert Cullen, José Luis Ucha Vanessa Hudgens, Kimiko Glenn, James Marsden, ... 2021-09-24 2021 PG 91 min Children & Family Movies Equestria's divided. But a bright-eyed hero be... 1 Kimiko Glenn Children & Family Movies No Data José Luis Ucha 9 September 2021 38
In [80]:
week_movies = Netflix_movies.groupby('week_added').title.nunique().reset_index(name='movie_count')

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(15, 5))

# KDE Plot
sns.kdeplot(data=week_movies, x='week_added', y='movie_count',ax=axes[0])
axes[0].set_title('KDE Plot')

# Line Plot
sns.lineplot(data=week_movies, x='week_added', y='movie_count', ax=axes[1])
axes[1].set_title('Line Plot')

# # joint plot
# sns.jointplot(data = week_movies, y='week_added', x='movie_count', kind='scatter',ax = axes[2])
# axes[3].set_title('Join Plot')

plt.suptitle('Movies Added to Netflix Over Weeks', fontsize=16)
plt.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.show()

Most movies are added between weeks 20 and 40, that is, from May to October, likely because of the summer holidays in May and the festivals that follow.

The KDE plot also shows a dense region in the first weeks of the year.

My recommendation:

Release high-budget movies in the first weeks of the year or in the middle of the year.

In [ ]:
week_Tv_shows = Netflix_Tv_shows.groupby('week_added')['title'].nunique().reset_index(name='show_count')

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(15, 5))

# KDE Plot
sns.kdeplot(data=week_Tv_shows, x='week_added', y='show_count',ax=axes[0])
axes[0].set_title('KDE Plot')

# Line Plot
sns.lineplot(data=week_Tv_shows, x='week_added', y='show_count', ax=axes[1])
axes[1].set_title('Line Plot')

plt.suptitle('TV Shows Added to Netflix Over Weeks', fontsize=16)
plt.show()

Most TV shows are added between weeks 15 and 35.

In [81]:
data = Netflix_un_nested_data.groupby(['week_added','type']).title.nunique().reset_index(name='count')
plt.figure(figsize=(12,8))
sns.barplot(data= data, x='week_added',y='count',hue='type')
Out[81]:
<Axes: xlabel='week_added', ylabel='count'>

Across the weeks, more movies than TV shows are added.

b. In which months are movies and TV shows added?

In [82]:
month_movies = Netflix_movies.groupby('month_name_added').title.nunique().reset_index(name='month_count')
month_movies
sizes = list(month_movies.month_count)
labels = list(month_movies.month_name_added)
plt.pie(sizes, labels=labels,autopct='%1.1f%%',explode=(0.3,0,0.1,0,0.1,0.2,0,0,0,0,0.1,0),shadow =True)
# plt.legend()
plt.show()
# month_movies.count

Most movies are added in April, July, January and October.

In April: Avengers, Jungle Book

In July: The Dark Knight, Inception, Spider-Man: Homecoming

This may be why these months are popular.
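Because `groupby('month_name_added')` returns the months in alphabetical order, the pie slices and the hard-coded `explode` tuple follow the alphabet rather than the calendar. A possible remedy, sketched here on invented counts, is an ordered categorical. The frame `month_counts` and its numbers are hypothetical, and `calendar.month_name` is assumed to return English names (the default locale).

```python
import calendar
import pandas as pd

# Hypothetical per-month counts in the alphabetical order groupby would return.
month_counts = pd.DataFrame({
    "month_name_added": ["April", "December", "January", "July"],
    "month_count": [30, 20, 25, 28],
})

# Ordered categorical: January < February < ... < December.
month_order = list(calendar.month_name)[1:]
month_counts["month_name_added"] = pd.Categorical(
    month_counts["month_name_added"], categories=month_order, ordered=True
)

# Sorting now follows the calendar instead of the alphabet.
chronological = month_counts.sort_values("month_name_added").reset_index(drop=True)
print(chronological)
```

With the months in calendar order, seasonal patterns are easier to read off a bar or pie chart.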

In [84]:
month_Tv_shows = Netflix_Tv_shows.groupby('month_name_added').title.nunique().reset_index(name='month_count')
sizes = list(month_Tv_shows.month_count)
labels = list(month_Tv_shows.month_name_added)
plt.pie(sizes, labels=labels,autopct='%1.1f%%',explode=(0,0,0.3,0,0,0.2,0,0,0,0,0,0.1),shadow =True)
# plt.legend()
plt.show()

For TV shows, December, July and September are the months with the most additions.

4 Analysis of actors/directors of different types of shows/movies.¶

In [85]:
Netflix_movies.head(2)
Out[85]:
show_id type title director Actors date_added release_year rating duration Genres description count Actor Genre country director_name month_added month_name_added year_added week_added
0 s1 Movie Dick Johnson Is Dead Kirsten Johnson No Data 2021-09-25 2020 PG-13 90 min Documentaries As her father nears the end of his life, filmm... 1 No Data Documentaries United States Kirsten Johnson 9 September 2021 38
159 s7 Movie My Little Pony: A New Generation Robert Cullen, José Luis Ucha Vanessa Hudgens, Kimiko Glenn, James Marsden, ... 2021-09-24 2021 PG 91 min Children & Family Movies Equestria's divided. But a bright-eyed hero be... 1 Vanessa Hudgens Children & Family Movies No Data Robert Cullen 9 September 2021 38
In [86]:
Actor_movies = Netflix_movies.groupby('Actor')['title'].nunique().reset_index(name='movies_count').sort_values(by='movies_count',ascending = False)[1:16]

ax = sns.barplot(data=Actor_movies, x='movies_count', y = 'Actor')
ax.bar_label(ax.containers[0], fontsize=10)
plt.show()
In [87]:
data = Netflix_movies.groupby(['country','Actor']).title.nunique().reset_index(name='moviecount').sort_values(by='moviecount',ascending=False)

us_movies = data[data.country == 'United States'][1:10]
india_movies = data[data.country == 'India'][1:10]
uk_movies = data[data.country == 'United Kingdom'][1:10]
canada_movies = data[data.country =='Canada'][1:10]
frances_movies = data[data.country == 'France'][1:10]
germany_movies = data[data.country == 'Germany'][1:10]
germany_movies
Out[87]:
country Actor moviecount
9190 Germany Michael Maertens 4
9084 Germany Louis Held 4
9074 Germany Lina Larissa Strahl 4
9081 Germany Lisa-Marie Koroll 4
8588 Germany Charly Hübner 4
8652 Germany Daniel Brühl 4
8882 Germany Itziar Aizpuru 3
8781 Germany Francesc Orella 3
9029 Germany Kostja Ullmann 3

Top actors in the United States, India, the United Kingdom, Canada, France and Germany

In [93]:
fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(15, 10))
fig.suptitle('Top 10 Actors by Country from movies', fontsize=16)

# Plot for the United States
sns.barplot(x='moviecount', y='Actor', data=us_movies, ax=axes[0, 0])
axes[0, 0].set_title('United States')

# Plot for India
sns.barplot(x='moviecount', y='Actor', data=india_movies, ax=axes[0, 1])
axes[0, 1].set_title('India')

# Plot for the United Kingdom
sns.barplot(x='moviecount', y='Actor', data=uk_movies, ax=axes[0, 2])
axes[0, 2].set_title('United Kingdom')

# Plot for Canada
sns.barplot(x='moviecount', y='Actor', data=canada_movies, ax=axes[1, 0])
axes[1, 0].set_title('Canada')

# Plot for France
sns.barplot(x='moviecount', y='Actor', data=frances_movies, ax=axes[1, 1])
axes[1, 1].set_title('France')

# Plot for Germany
sns.barplot(x='moviecount', y='Actor', data=germany_movies, ax=axes[1, 2])
axes[1, 2].set_title('Germany')

plt.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.show()

My recommendation: country by country, these are the top actors I recommend featuring in movies added to Netflix.
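The per-country frames above skip the first row with a positional slice such as `[1:10]`, which assumes the 'No Data' placeholder always ranks first and returns nine rows rather than ten. A hedged alternative is to drop the placeholder by value. The helper `top_n_by_country` below is a hypothetical name introduced for illustration, not part of the notebook, and the toy data is invented.

```python
import pandas as pd

def top_n_by_country(flat, entity_col, country, n=10, placeholder="No Data"):
    """Return the n entities with the most unique titles in one country.

    `flat` is assumed to hold one row per (title, country, entity) combination.
    Placeholder rows are dropped by value instead of by position.
    """
    subset = flat[(flat["country"] == country) & (flat[entity_col] != placeholder)]
    return (
        subset.groupby(entity_col)["title"]
              .nunique()
              .sort_values(ascending=False)
              .head(n)
              .reset_index(name="title_count")
    )

# Hypothetical toy data.
toy = pd.DataFrame({
    "title":   ["T1", "T2", "T3", "T1", "T4"],
    "country": ["India", "India", "India", "India", "Japan"],
    "Actor":   ["No Data", "Actor A", "Actor A", "Actor B", "Actor C"],
})

print(top_n_by_country(toy, "Actor", "India", n=2))
```

The same helper would work for directors by passing `"director_name"` as `entity_col`.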

In [94]:
Actor_Tv_shows = Netflix_Tv_shows.groupby('Actor')['title'].nunique().reset_index(name='show_count').sort_values(by='show_count',ascending = False)[1:16]

ax = sns.barplot(data=Actor_Tv_shows, x='show_count', y = 'Actor')
ax.bar_label(ax.containers[0], fontsize=10)
plt.title('Top 15 actors in Netflix TV shows')
plt.show()
In [95]:
data = Netflix_Tv_shows.groupby(['country','Actor']).title.nunique().reset_index(name='showcount').sort_values(by='showcount',ascending=False)

us_tv_shows = data[data.country == 'United States'][1:10]
UK_tv_shows = data[data.country == 'United Kingdom'][1:10]
japan_tv_shows = data[data.country == 'Japan'][1:10]
SK_tv_shows = data[data.country == 'South Korea'][1:10]
canada_tv_shows = data[data.country == 'Canada'][1:10]
France_tv_shows = data[data.country == 'France'][1:10]
In [92]:
fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(15, 10))
fig.suptitle('Top 10 Actors by Country from Tv_shows', fontsize=16)

# Plot for the United States
sns.barplot(x='showcount', y='Actor', data=us_tv_shows, ax=axes[0, 0])
axes[0, 0].set_title('United States')

# Plot for the United Kingdom
sns.barplot(x='showcount', y='Actor', data=UK_tv_shows, ax=axes[0, 1])
axes[0, 1].set_title('United Kingdom')

# Plot for Japan
sns.barplot(x='showcount', y='Actor', data=japan_tv_shows, ax=axes[0, 2])
axes[0, 2].set_title('Japan')

# Plot for South Korea
sns.barplot(x='showcount', y='Actor', data=SK_tv_shows, ax=axes[1, 0])
axes[1, 0].set_title('South Korea')

# Plot for Canada
sns.barplot(x='showcount', y='Actor', data=canada_tv_shows, ax=axes[1, 1])
axes[1, 1].set_title('Canada')

# Plot for France
sns.barplot(x='showcount', y='Actor', data=France_tv_shows, ax=axes[1, 2])
axes[1, 2].set_title('France')

plt.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.show()

My recommendation:

These are the top actors I recommend for TV shows in the top countries.

In [96]:
data = Netflix_movies.groupby('director_name').title.nunique().reset_index(name='moviecount').sort_values(by='moviecount',ascending = False)[1:11]

ax = sns.barplot(data=data, x = 'moviecount', y='director_name')
ax.bar_label(ax.containers[0], fontsize=10)
plt.title('Top 10 directors from movies')
plt.show()
In [98]:
movies = Netflix_movies.groupby(['country','director_name'])['title'].nunique().reset_index(name='unique_titles_count').sort_values(by='unique_titles_count',ascending=False)

us_movies = movies[movies.country == 'United States'][1:10]
india_movies = movies[movies.country == 'India'][1:10]
uk_movies = movies[movies.country == 'United Kingdom'][1:10]
canada_movies = movies[movies.country =='Canada'][1:10]
frances_movies = movies[movies.country == 'France'][1:10]
germany_movies = movies[movies.country == 'Germany'][1:10]

fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(15, 10))
fig.suptitle('Top 10 Directors by Country from movies', fontsize=16)

# Plot for the United States
sns.barplot(x='unique_titles_count', y='director_name', data=us_movies, ax=axes[0, 0])
axes[0, 0].set_title('United States')

# Plot for India
sns.barplot(x='unique_titles_count', y='director_name', data=india_movies, ax=axes[0, 1])
axes[0, 1].set_title('India')

# Plot for the United Kingdom
sns.barplot(x='unique_titles_count', y='director_name', data=uk_movies, ax=axes[0, 2])
axes[0, 2].set_title('United Kingdom')

# Plot for Canada
sns.barplot(x='unique_titles_count', y='director_name', data=canada_movies, ax=axes[1, 0])
axes[1, 0].set_title('Canada')

# Plot for France
sns.barplot(x='unique_titles_count', y='director_name', data=frances_movies, ax=axes[1, 1])
axes[1, 1].set_title('France')

# Plot for Germany
sns.barplot(x='unique_titles_count', y='director_name', data=germany_movies, ax=axes[1, 2])
axes[1, 2].set_title('Germany')

plt.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.show()
In [99]:
data = Netflix_Tv_shows.groupby('director_name').title.nunique().reset_index(name='showcount').sort_values(by='showcount',ascending = False)[1:11]

ax = sns.barplot(data=data, x = 'showcount', y='director_name')
ax.bar_label(ax.containers[0], fontsize=10)
plt.title('Top 10 directors from TV shows')
plt.show()
In [100]:
Tv_shows = Netflix_Tv_shows.groupby(['country','director_name'])['title'].nunique().reset_index(name='unique_title_count').sort_values(by='unique_title_count', ascending = False)



us_tv_shows = Tv_shows[Tv_shows.country == 'United States'][1:10]
UK_tv_shows = Tv_shows[Tv_shows.country == 'United Kingdom'][1:10]
japan_tv_shows = Tv_shows[Tv_shows.country == 'Japan'][1:10]
SK_tv_shows = Tv_shows[Tv_shows.country == 'South Korea'][1:10]
canada_tv_shows = Tv_shows[Tv_shows.country == 'Canada'][1:10]
France_tv_shows = Tv_shows[Tv_shows.country == 'France'][1:10]


fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(15, 10))
fig.suptitle('Top 10 Directors by country in Tv_shows', fontsize=16)

# Plot for the United States
sns.barplot(x='unique_title_count', y='director_name', data=us_tv_shows, ax=axes[0, 0])
axes[0, 0].set_title('United States')

# Plot for the United Kingdom
sns.barplot(x='unique_title_count', y='director_name', data=UK_tv_shows, ax=axes[0, 1])
axes[0, 1].set_title('United Kingdom')

# Plot for Japan
sns.barplot(x='unique_title_count', y='director_name', data=japan_tv_shows, ax=axes[0, 2])
axes[0, 2].set_title('Japan')

# Plot for South Korea
sns.barplot(x='unique_title_count', y='director_name', data=SK_tv_shows, ax=axes[1, 0])
axes[1, 0].set_title('South Korea')

# Plot for Canada
sns.barplot(x='unique_title_count', y='director_name', data=canada_tv_shows, ax=axes[1, 1])
axes[1, 1].set_title('Canada')

# Plot for France
sns.barplot(x='unique_title_count', y='director_name', data=France_tv_shows, ax=axes[1, 2])
axes[1, 2].set_title('France')

plt.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.show()

5 Which genre movies are more popular or produced more.¶

In [101]:
Netflix_un_nested_data.head()
Out[101]:
show_id type title director Actors date_added release_year rating duration Genres description count Actor Genre country split_director month_added month_name_added year_added week_added
0 s1 Movie Dick Johnson Is Dead Kirsten Johnson No Data 2021-09-25 2020 PG-13 90 min Documentaries As her father nears the end of his life, filmm... 1 No Data Documentaries United States Kirsten Johnson 9 September 2021 38
1 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... 2021-09-24 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Ama Qamata International TV Shows South Africa No Data 9 September 2021 38
2 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... 2021-09-24 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Ama Qamata TV Dramas South Africa No Data 9 September 2021 38
3 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... 2021-09-24 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Ama Qamata TV Mysteries South Africa No Data 9 September 2021 38
4 s2 TV Show Blood & Water No Data Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban... 2021-09-24 2021 TV-MA 2 Seasons International TV Shows, TV Dramas, TV Mysteries After crossing paths at a party, a Cape Town t... 1 Khosi Ngema International TV Shows South Africa No Data 9 September 2021 38
In [ ]:
!pip install WordCloud
In [102]:
from wordcloud import WordCloud
import matplotlib.pyplot as plt

# 'Genre' is the un-nested genre column (one genre per row)
genres_text = ' '.join(Netflix_un_nested_data['Genre'].dropna())

# Create and generate a word cloud image
wordcloud = WordCloud(width=800, height=400, background_color='white').generate(genres_text)

# Display the generated word cloud image
plt.figure(figsize=(10, 5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.title('Word Cloud of Genres')
plt.show()

In the word cloud above, the terms for International Movies, Comedies, Children & Family Movies and Dramas are drawn largest, which means these genres occur more often in the data than the others.
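Joining the genres into one string lets the word cloud split multi-word genres into separate words, and the un-nested rows repeat each genre once per actor and director. A possible refinement, sketched below on invented data, is to count whole genre names over unique (title, genre) pairs. The frame `toy` and its values are hypothetical; only the column names `show_id` and `Genre` follow the notebook.

```python
import pandas as pd

# Hypothetical un-nested rows: the same title repeats once per actor.
toy = pd.DataFrame({
    "show_id": ["s1", "s1", "s1", "s2", "s3"],
    "Genre":   ["Dramas", "Dramas", "Dramas", "International Movies", "Dramas"],
})

# Keep one row per (title, genre) pair so large casts do not inflate a genre.
pairs = toy.drop_duplicates(subset=["show_id", "Genre"])

# Whole-phrase frequencies: "International Movies" stays a single token.
genre_freq = pairs["Genre"].value_counts().to_dict()
print(genre_freq)
```

A dictionary of this shape can be passed to `WordCloud.generate_from_frequencies` in place of the joined text.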

In [105]:
from sklearn.preprocessing import MultiLabelBinarizer

import matplotlib.colors


# Custom colour map based on Netflix palette
cmap = matplotlib.colors.LinearSegmentedColormap.from_list("", ['#221f1f', '#b20710','#f5f5f1'])



def genre_heatmap(df, title):
    # Work on a copy so the caller's DataFrame is not modified
    df = df.copy()
    # Split the comma-separated genre string into a list of genres
    df['genre'] = df['listed_in'].apply(lambda x: x.replace(' ,', ',').replace(', ', ',').split(','))
    Types = set()
    for genres in df['genre']:
        Types.update(genres)
    print("There are {} types in the Netflix {} Dataset".format(len(Types), title))
    # One 0/1 indicator column per genre
    mlb = MultiLabelBinarizer()
    res = pd.DataFrame(mlb.fit_transform(df['genre']), columns=mlb.classes_, index=df.index)
    corr = res.corr()
    # Mask the upper triangle (np.bool was removed in NumPy 1.24; use the built-in bool)
    mask = np.zeros_like(corr, dtype=bool)
    mask[np.triu_indices_from(mask)] = True
    fig, ax = plt.subplots(figsize=(10, 5))
    fig.text(.54, .88, 'Genre correlation', fontfamily='serif', fontweight='bold', fontsize=15)

    sns.heatmap(corr, mask=mask, cmap=cmap, vmax=.3, vmin=-.3, center=0, square=True, linewidths=2.5)

    plt.show()
In [106]:
df_tv = Netflix_original_data[Netflix_original_data["type"] == "TV Show"]
df_movies = Netflix_original_data[Netflix_original_data["type"] == "Movie"]


genre_heatmap(df_movies, 'Movie')
plt.show()
There are 20 types in the Netflix Movie Dataset

It is interesting that Independent Movies tend to be Dramas.

Another observation is that International Movies are rarely in the Children's genre.

Documentaries and Dramas are negatively correlated, so the two genres rarely appear together.

The white cells mark the genre pairs with the highest correlation.
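To make the meaning of a positive or negative genre correlation concrete, here is a small worked sketch. It uses pandas' `str.get_dummies` instead of `MultiLabelBinarizer` to build the 0/1 indicator columns, and the six `listed_in` values are invented for illustration.

```python
import pandas as pd

# Hypothetical `listed_in` values for six titles.
listed_in = pd.Series([
    "Dramas, Independent Movies",
    "Dramas, Independent Movies",
    "Dramas",
    "Documentaries",
    "Documentaries",
    "Comedies",
])

# One 0/1 indicator column per genre.
indicators = listed_in.str.get_dummies(sep=", ")

# Pearson correlation between the indicator columns.
corr = indicators.corr()

print(corr.loc["Dramas", "Independent Movies"].round(2))
print(corr.loc["Dramas", "Documentaries"].round(2))
```

Genres that tend to be listed together get a positive coefficient, while genres that almost never share a title get a negative one.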

In [107]:
genre_heatmap(df_tv, 'TV')
plt.show()
There are 22 types in the Netflix TV Dataset

International TV Shows and Kids' TV are negatively correlated, so this combination is rare and not recommended.

The white cells mark the genre pairs with the highest correlation.

6 Find after how many days a movie is added to Netflix after its release (you can consider the recent past data)¶

In [108]:
Netflix_movies.head()
Out[108]:
show_id type title director Actors date_added release_year rating duration Genres description count Actor Genre country director_name month_added month_name_added year_added week_added
0 s1 Movie Dick Johnson Is Dead Kirsten Johnson No Data 2021-09-25 2020 PG-13 90 min Documentaries As her father nears the end of his life, filmm... 1 No Data Documentaries United States Kirsten Johnson 9 September 2021 38
159 s7 Movie My Little Pony: A New Generation Robert Cullen, José Luis Ucha Vanessa Hudgens, Kimiko Glenn, James Marsden, ... 2021-09-24 2021 PG 91 min Children & Family Movies Equestria's divided. But a bright-eyed hero be... 1 Vanessa Hudgens Children & Family Movies No Data Robert Cullen 9 September 2021 38
160 s7 Movie My Little Pony: A New Generation Robert Cullen, José Luis Ucha Vanessa Hudgens, Kimiko Glenn, James Marsden, ... 2021-09-24 2021 PG 91 min Children & Family Movies Equestria's divided. But a bright-eyed hero be... 1 Vanessa Hudgens Children & Family Movies No Data José Luis Ucha 9 September 2021 38
161 s7 Movie My Little Pony: A New Generation Robert Cullen, José Luis Ucha Vanessa Hudgens, Kimiko Glenn, James Marsden, ... 2021-09-24 2021 PG 91 min Children & Family Movies Equestria's divided. But a bright-eyed hero be... 1 Kimiko Glenn Children & Family Movies No Data Robert Cullen 9 September 2021 38
162 s7 Movie My Little Pony: A New Generation Robert Cullen, José Luis Ucha Vanessa Hudgens, Kimiko Glenn, James Marsden, ... 2021-09-24 2021 PG 91 min Children & Family Movies Equestria's divided. But a bright-eyed hero be... 1 Kimiko Glenn Children & Family Movies No Data José Luis Ucha 9 September 2021 38
In [109]:
# Only the release year is known, so the gap is measured in whole years, not days
Netflix_movies['Days_to_Netflix'] = (Netflix_movies['year_added'] - Netflix_movies['release_year'])


# plt.figure(figsize=(10, 6))
plt.hist(Netflix_movies['Days_to_Netflix'], bins=20, color='skyblue', edgecolor='black')
plt.title('Distribution of years between release and addition to Netflix')
plt.xlabel('Years to Netflix')
plt.ylabel('Frequency')
plt.show()
In [110]:
Netflix_movies['Days_to_Netflix'].mode()[0]  # the most common gap is 0 years, i.e. many movies are added in their release year; the question is somewhat ambiguous, so see the visualisation at the end of this section
Out[110]:
0
In [112]:
Netflix_movies['Days_to_Netflix'].mean()
Out[112]:
6.8207867364220425

On average, a movie is added to Netflix about 6.8 years after its release.

In [113]:
# plt.figure(figsize=(10, 6))
plt.boxplot(Netflix_movies['Days_to_Netflix'])
plt.title('Boxplot of years to Netflix')
plt.ylabel('Years to Netflix')
plt.show()

Gaps of more than about 20 years show up as outliers.
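Two caveats apply to the numbers above: the gap is a difference of years, and it is computed on un-nested rows, so titles with large casts carry more weight. The sketch below shows one way to compute the gap once per title and to approximate it in days. It assumes a release on 1 January of `release_year`, which is an assumption made here because the dataset only records the year. The frame `toy` and its values are invented.

```python
import pandas as pd

# Hypothetical un-nested rows: title s1 repeats once per actor.
toy = pd.DataFrame({
    "show_id":      ["s1", "s1", "s1", "s2", "s3"],
    "date_added":   pd.to_datetime(["2021-09-25"] * 3 + ["2020-04-01", "2019-05-10"]),
    "release_year": [2020, 2020, 2020, 2012, 2018],
})

# One row per title, so titles with many actors are not over-weighted.
titles = toy.drop_duplicates(subset="show_id").copy()

# Gap in whole years, as in the notebook.
titles["years_to_netflix"] = titles["date_added"].dt.year - titles["release_year"]

# Rough gap in days, ASSUMING a release on 1 January of `release_year`;
# this overestimates the gap for titles released later in that year.
assumed_release = pd.to_datetime(titles["release_year"].astype(str) + "-01-01")
titles["approx_days_to_netflix"] = (titles["date_added"] - assumed_release).dt.days

print(titles[["show_id", "years_to_netflix", "approx_days_to_netflix"]])
print("Median gap in years:", titles["years_to_netflix"].median())
```

Reporting the median alongside the mean is useful here because a few very old titles can pull the mean upwards.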

In [115]:
### Relevant groupings

data = Netflix_movies.groupby('country')['count'].sum().sort_values(ascending=False).reset_index()[:10]
data = data['country']
df_loli = Netflix_movies.loc[Netflix_movies['country'].isin(data)]

# Select several columns with a list; tuple-style selection was removed in pandas 2.0
loli = df_loli.groupby('country')[['release_year','year_added']].mean().round()


# Reorder it following the values of the first value
ordered_df = loli.sort_values(by='release_year')

ordered_df_rev = loli.sort_values(by='release_year',ascending=False)

my_range=range(1,len(loli.index)+1)


fig, ax = plt.subplots(1, 1, figsize=(7, 5))

fig.text(0.13, 0.9, 'How old are the movies? [Average]', fontsize=15, fontweight='bold', fontfamily='serif')
plt.hlines(y=my_range, xmin=ordered_df['release_year'], xmax=ordered_df['year_added'], color='grey', alpha=0.4)
plt.scatter(ordered_df['release_year'], my_range, color='#221f1f',s=100, alpha=0.9, label='Average release date')
plt.scatter(ordered_df['year_added'], my_range, color='#b20710',s=100, alpha=0.9 , label='Average added date')
#plt.legend()

for s in ['top', 'left', 'right', 'bottom']:
    ax.spines[s].set_visible(False)


# Removes the tick marks but keeps the labels
ax.tick_params(axis=u'both', which=u'both',length=0)
# Move Y axis to the right side
ax.yaxis.tick_right()

plt.yticks(my_range, ordered_df.index)
plt.yticks(fontname = "serif",fontsize=12)

# Custom legend
fig.text(0.19,0.175,"Released", fontweight="bold", fontfamily='serif', fontsize=12, color='#221f1f')
fig.text(0.76,0.175,"Added", fontweight="bold", fontfamily='serif', fontsize=12, color='#b20710')




#plt.xlabel('Year')
#plt.ylabel('Country')
plt.show()

The average gap between when content is released, and when it is then added on Netflix varies by country.

In Spain, Netflix appears to be dominated by newer movies whereas Germany, United States , United Kingdom & India have an older average movie.
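The per-country comparison can also be read as a table instead of a chart. A minimal sketch, assuming hypothetical toy rows in `toy` (the values are invented for illustration and the `gap_years` column is not part of the notebook):

```python
import pandas as pd

# Hypothetical rows; the real notebook uses Netflix_movies
toy = pd.DataFrame({
    "country": ["Spain", "Spain", "India", "India",
                "United States", "United States"],
    "release_year": [2018, 2020, 2008, 2016, 2010, 2018],
    "year_added":   [2020, 2020, 2018, 2020, 2019, 2021],
})

# Average release and addition year per country, rounded as in the chart
avg = toy.groupby("country")[["release_year", "year_added"]].mean().round()

# Length of the grey connector line in the chart
avg["gap_years"] = avg["year_added"] - avg["release_year"]

# Oldest average release year first, matching the chart's ordering
avg = avg.sort_values("release_year")
print(avg)
```

Sorting by `gap_years` instead of `release_year` would rank countries directly by how long content takes to arrive.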

In [116]:
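# Top 10 countries by number of TV shows
data = Netflix_Tv_shows.groupby('country')['count'].sum().sort_values(ascending=False).reset_index()[:10]
data = data['country']
df_loli = Netflix_Tv_shows.loc[Netflix_Tv_shows['country'].isin(data)]

# Select several columns with a list; tuple-style selection is not accepted by recent pandas versions
loli = df_loli.groupby('country')[['release_year','year_added']].mean().round()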


# Reorder it following the values of the first value:
ordered_df = loli.sort_values(by='release_year')

ordered_df_rev = loli.sort_values(by='release_year',ascending=False)

my_range=range(1,len(loli.index)+1)


fig, ax = plt.subplots(1, 1, figsize=(7, 5))

fig.text(0.13, 0.9, 'How old are the TV shows? [Average]', fontsize=15, fontweight='bold', fontfamily='serif')
plt.hlines(y=my_range, xmin=ordered_df['release_year'], xmax=ordered_df['year_added'], color='grey', alpha=0.4)
plt.scatter(ordered_df['release_year'], my_range, color='#221f1f',s=100, alpha=0.9, label='Average release date')
plt.scatter(ordered_df['year_added'], my_range, color='#b20710',s=100, alpha=0.9 , label='Average added date')
#plt.legend()

for s in ['top', 'left', 'right', 'bottom']:
    ax.spines[s].set_visible(False)

ax.yaxis.tick_right()
plt.yticks(my_range, ordered_df.index)
plt.yticks(fontname = "serif",fontsize=12)


fig.text(0.19,0.175,"Released", fontweight="bold", fontfamily='serif', fontsize=12, color='#221f1f')

fig.text(0.47,0.175,"Added", fontweight="bold", fontfamily='serif', fontsize=12, color='#b20710')


fig.text(0.13, 0.42,
'''The gap for TV shows seems
more regular than for movies.

This is likely due to subsequent
series being released
year-on-year.

Spain seems to have
the newest content
overall.
'''

, fontsize=12, fontweight='light', fontfamily='serif')


ax.tick_params(axis=u'both', which=u'both',length=0)
#plt.xlabel('Value of the variables')
#plt.ylabel('Group')
plt.show()
In [126]:
ratings_ages = {
    'TV-PG': 'Older Kids',
    'TV-MA': 'Adults',
    'TV-Y7-FV': 'Older Kids',
    'TV-Y7': 'Older Kids',
    'TV-14': 'Teens',
    'R': 'Adults',
    'TV-Y': 'Kids',
    'NR': 'Adults',
    'PG-13': 'Teens',
    'TV-G': 'Kids',
    'PG': 'Older Kids',
    'G': 'Kids',
    'UR': 'Adults',
    'NC-17': 'Adults'
}

Netflix_un_nested_data['target_ages'] = Netflix_un_nested_data['rating'].replace(ratings_ages)
In [127]:
data = Netflix_un_nested_data.groupby('country')[['country','count']].sum().sort_values(by='count',ascending=False).reset_index()[:10]
data = data['country']


df_heatmap = Netflix_un_nested_data.loc[Netflix_un_nested_data['country'].isin(data)]
In [128]:
df_heatmap = pd.crosstab(df_heatmap['country'],df_heatmap['target_ages'],normalize = "index").T
In [131]:
fig, ax = plt.subplots(1, 1, figsize=(12, 12))

country_order2 = ['United States', 'India', 'United Kingdom', 'Canada', 'Japan', 'France', 'South Korea', 'Spain']

age_order = ['Kids','Older Kids','Teens','Adults']

sns.heatmap(df_heatmap.loc[age_order,country_order2],cmap=cmap,square=True, linewidth=2.5,cbar=False,
            annot=True,fmt='1.0%',vmax=.6,vmin=0.05,ax=ax,annot_kws={"fontsize":12})

ax.spines['top'].set_visible(True)


fig.text(.99, .725, 'Target ages proportion of total content by country', fontweight='bold', fontfamily='serif', fontsize=15,ha='right')
fig.text(0.99, 0.7, 'Here we see interesting differences between countries. Most shows in India are targeted to teens, for instance.',ha='right', fontsize=12,fontfamily='serif')

ax.set_yticklabels(ax.get_yticklabels(), fontfamily='serif', rotation = 0, fontsize=11)
ax.set_xticklabels(ax.get_xticklabels(), fontfamily='serif', rotation=90, fontsize=11)

ax.set_ylabel('')
ax.set_xlabel('')
ax.tick_params(axis=u'both', which=u'both',length=0)
plt.tight_layout()
plt.show()

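In the United Kingdom and the United States, the largest share of content is targeted at adults.

In India, the largest share of content is targeted at teens. Note that the heatmap describes the catalogue by country, not viewing figures.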
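The proportions in the heatmap come from `pd.crosstab` with `normalize="index"`, which divides each country's counts by that country's total so every country sums to 100%. A minimal sketch on hypothetical rows (the frame `toy` and its values are invented for illustration):

```python
import pandas as pd

# Hypothetical rows; the real notebook uses Netflix_un_nested_data
toy = pd.DataFrame({
    "country": ["India"] * 4 + ["United States"] * 4,
    "target_ages": ["Teens", "Teens", "Teens", "Adults",
                    "Adults", "Adults", "Kids", "Teens"],
})

# Rows are countries before the transpose, so normalize="index"
# makes each country's shares sum to 1
shares = pd.crosstab(toy["country"], toy["target_ages"], normalize="index").T

# Largest target-age group per country
top_group = shares.idxmax().to_dict()

print(shares)
print(top_group)
```

Because the normalisation is per country, the cells compare the mix of content within each country, not the absolute number of titles between countries.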

Some suggested recommendations:

Diverse Storytelling:¶

Embrace diversity in storytelling to appeal to a broad audience. Include narratives that represent different cultures, backgrounds, and perspectives.

Interactive Content:¶

Experiment with more interactive content like "Bandersnatch," providing viewers with the ability to make choices that impact the storyline. This engagement can enhance the viewing experience.

Invest in Original Content:¶

Continue investing in high-quality original content. Exclusive shows and movies can set Netflix apart from other platforms and attract subscribers.

User Feedback Integration:¶

Consider gathering feedback from users and integrating it into content creation. Understanding audience preferences can guide decisions on what types of shows and movies to produce.

Genre Variety:¶

Maintain a diverse range of genres to cater to different tastes. Offer a mix of drama, comedy, sci-fi, fantasy, documentaries, etc., to ensure there's something for everyone.

Collaborate with Emerging Talent:¶

Collaborate with emerging filmmakers, directors, and writers. This can bring fresh and innovative perspectives to the platform.

Invest in Visual Excellence:¶

Continue to invest in high production values and cutting-edge visual effects. Stunning visuals can enhance the overall viewer experience.

Global Appeal:¶

Create content with global appeal. Consider stories that can resonate with audiences worldwide, transcending cultural and geographical boundaries.

Data-Driven Decision-Making:¶

Leverage data analytics to understand viewer behavior. Analyzing what works and what doesn't can guide content decisions and help in tailoring recommendations.

Content Curation:¶

Improve content curation algorithms to provide more personalized recommendations. This can help users discover content that aligns with their interests.

Innovative Marketing:¶

Develop innovative marketing strategies to promote new releases. Create buzz through social media, influencers, and other channels to increase awareness.

Address Social Issues:¶

Explore content that addresses important social issues. This can not only contribute positively to society but also resonate with viewers who appreciate socially conscious storytelling.

Continued Adaptation:¶

Stay attuned to industry trends and viewer preferences. Be willing to adapt and evolve content strategies based on changing dynamics in the entertainment landscape.

Remember that the entertainment industry is dynamic, and what resonates with audiences can change over time. A combination of creativity, adaptability, and a deep understanding of the audience can contribute to the ongoing success and improvement of Netflix's content.
